Information Extraction and Sentence Classification applied to Clinical Trial MEDLINE Abstracts

Author

  • Kazuo Hara
Abstract

In this paper, we first report experimental results on applying information extraction (IE) methodology to the task of summarizing clinical trial design information, focusing on “Compared Treatment”, “Endpoint” and “Patient Population”, from clinical trial MEDLINE abstracts. From these results, we have come to see this problem as one that can be decomposed into a sentence classification subtask and an IE subtask. We hypothesize that classifying sentences from clinical trial abstracts and performing IE only on the sentences most likely to contain relevant information can increase the accuracy of the information extracted from the abstracts. As preparation for testing this hypothesis in the next stage, we conducted an experiment applying state-of-the-art sentence classification techniques to the clinical trial abstracts and evaluated their potential for the original task of summarizing clinical trial design information.
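
The decomposition described in the abstract (classify sentences first, then run IE only on the likely-relevant ones) can be illustrated with a minimal Python sketch. This is not the authors' system: the cue-word lists, the function names, and the keyword matcher standing in for a trained sentence classifier are all assumptions made for illustration, and the IE stage is reduced to a placeholder that simply keeps the matching sentence.

```python
import re

# Hypothetical cue words for each design element; these lists are
# illustrative assumptions, not taken from the paper.
CUES = {
    "compared_treatment": ("versus", "compared with", "placebo", "randomized to"),
    "endpoint": ("endpoint", "end point", "outcome"),
    "patient_population": ("patients with", "participants", "subjects"),
}


def split_sentences(abstract):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]


def classify(sentence):
    """Stand-in for a trained classifier: labels whose cue words occur in the sentence."""
    lowered = sentence.lower()
    return {label for label, cues in CUES.items() if any(c in lowered for c in cues)}


def extract(abstract):
    """Stage 1 filters sentences by label; stage 2 runs only on the hits.

    Stage 2 is a placeholder here: it records the whole sentence where a
    real IE component would pull out the specific phrase.
    """
    result = {label: [] for label in CUES}
    for sentence in split_sentences(abstract):
        for label in classify(sentence):
            result[label].append(sentence)
    return result


if __name__ == "__main__":
    demo = (
        "We enrolled 120 patients with type 2 diabetes. "
        "Drug A was compared with placebo. "
        "The primary endpoint was change in HbA1c. "
        "Funding was provided by a public grant."
    )
    for label, sentences in extract(demo).items():
        print(label, "->", sentences)
```

Sentences that match no label, such as the funding statement in the demo text, never reach the extraction stage, which is the filtering effect the abstract hypothesizes will improve accuracy.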


Similar resources

LSAT: learning about alternative transcripts in MEDLINE

MOTIVATION Generation of alternative transcripts from the same gene is an important biological event due to their contribution in creating functional diversity in eukaryotes. In this work, we choose the task of extracting information around this complex topic using a two-step procedure involving machine learning and information extraction. RESULTS In the first step, we trained a classifier th...

Classification of Clinically Useful Sentences in MEDLINE

OBJECTIVE In a previous study, we investigated a sentence classification model that uses semantic features to extract clinically useful sentences from UpToDate, a synthesized clinical evidence resource. In the present study, we assess the generalizability of the sentence classifier to Medline abstracts. METHODS We applied the classification model to an independent gold standard of high qualit...

Identifying Sections in Scientific Abstracts using Conditional Random Fields

OBJECTIVE: The prior knowledge about the rhetorical structure of scientific abstracts is useful for various text-mining tasks such as information extraction, information retrieval, and automatic summarization. This paper presents a novel approach to categorize sentences in scientific abstracts into four sections, objective, methods, results, and conclusions. METHOD: Formalizing the categorizati...

System for Extraction of Protein Interaction Based on Aging-Related Corpus

Recently, the field of genomics has developed explosively, but less progress has been made in the field of proteomics, because proteomics research is more expensive and time-consuming. Therefore, efficient proteomics research requires information extraction from free-text articles (MEDLINE), such as classification of proteins, extraction of protein-protein interaction facts, ...

Information Extraction of Protein Phosphorylation from Biomedical Literature

ABSTRACT Protein posttranslational modification (PTM) is a fundamental biological process, and currently few text mining systems focus on PTM information extraction. A rule-based text mining system, RLIMS-P (Rule-based LIterature Mining System for Protein Phosphorylation), was recently developed by our group to extract protein substrate, kinase and phosphorylated residue/sites from MEDLINE abst...

Publication date: 2005